hum tee dum tee#43

Open
adiled wants to merge 20 commits into main from hum-paths-crate

@adiled
Owner

@adiled adiled commented May 30, 2026

No description provided.

adiled added 20 commits May 30, 2026 15:15
New hum-paths crate. init() sets XDG defaults at startup; every callsite
routes through it. No /tmp fallback. Socket moves to $XDG_STATE_HOME/hum.
Issue #41 root cause was Arc<Mutex<Option<Child>>> + try_lock in
claude-cli — wait task held the lock forever, kill closure's try_lock
always failed, SIGKILL never reached the child.

- nest::lifecycle::supervise(AsyncGroupChild) using tokio::select! over
  child.wait() vs CancellationToken. command-group for tree-kill so
  claude's descendants die too. Reaps with timeout.
- Cell.kill: Arc<dyn Fn> -> Cell.cancel: CancellationToken.
- Drop on CellBundle cancels on map removal; idle reaper + LRU evict
  + map clear all kill correctly.
- claude-cli + claude-repl + mock + nest::pool + serve.rs migrated.
- lru::LruCache replaces HashMap + linear-scan eviction (O(1)).
- nest::metrics swapped /proc parser for sysinfo (cross-platform RSS/CPU).
- metrics + metrics-exporter-prometheus on humd; /metrics on 127.0.0.1:9909
  (HUM_METRICS_ADDR override). Counters at evict + kill sites; gauge for
  active cells.
- governor token bucket on thrumd accept loop (100/s).
- Reconnect jitter on serve_worker reconnect to spread thundering herds.
- 3 lifecycle tests: cancel kills, natural exit propagates, tree-kill
  takes grandchild.
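
The Issue #41 failure mode can be reproduced without spawning any process. The sketch below is a std-only stand-in, not the claude-cli code: `FakeChild`, `try_kill`, and `kill_while_waiter_holds_lock` are hypothetical names, and a thread holding a `std::sync::Mutex` guard plays the role of the wait task.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for the real child-process handle.
struct FakeChild {
    killed: bool,
}

/// Mirrors the old kill closure: it only reaches the child if try_lock succeeds.
fn try_kill(slot: &Arc<Mutex<Option<FakeChild>>>) -> bool {
    match slot.try_lock() {
        Ok(mut guard) => match guard.as_mut() {
            Some(child) => {
                child.killed = true;
                child.killed
            }
            None => false,
        },
        // Lock held elsewhere: the kill request is silently dropped.
        Err(_) => false,
    }
}

/// A "wait task" thread takes the lock and keeps it, as the old wait task did
/// for the child's whole lifetime. Returns whether the kill reached the child.
fn kill_while_waiter_holds_lock() -> bool {
    let slot = Arc::new(Mutex::new(Some(FakeChild { killed: false })));
    let (locked_tx, locked_rx) = mpsc::channel::<()>();
    let (release_tx, release_rx) = mpsc::channel::<()>();

    let waiter_slot = Arc::clone(&slot);
    let waiter = thread::spawn(move || {
        let _guard = waiter_slot.lock().unwrap();
        locked_tx.send(()).unwrap();
        // Simulates awaiting child exit while still holding the guard.
        let _ = release_rx.recv();
    });

    locked_rx.recv().unwrap(); // the waiter now owns the lock
    let reached = try_kill(&slot);
    release_tx.send(()).unwrap();
    waiter.join().unwrap();
    reached
}
```

With the waiter holding the guard, `kill_while_waiter_holds_lock()` reports that the kill never arrived. Racing `child.wait()` against a cancellation signal removes the shared lock from the kill path altogether.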
DaemonConfig::from_env() (and a few other early callers) hit hum_paths
functions before any init() ran. Tests panicked. Making xdg() call init()
on first miss makes init() optional everywhere; explicit init() at bin
entry stays as an eager warmup so child processes inherit the env.
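
A minimal sketch of the "init on first miss" shape, assuming a `OnceLock`-backed cache. The `Xdg` struct, its field names, and the `resolve` helper are illustrative rather than the hum-paths API, and the sketch does not export the resolved values back into the environment the way the explicit `init()` warmup is described as doing.

```rust
use std::path::PathBuf;
use std::sync::OnceLock;

/// Resolved base directories (field names are illustrative).
#[derive(Debug)]
pub struct Xdg {
    pub config_home: PathBuf,
    pub state_home: PathBuf,
}

static XDG: OnceLock<Xdg> = OnceLock::new();

/// Use the env value when set and non-empty, else $HOME joined with the XDG default.
fn resolve(var: Option<String>, home: &str, default_rel: &str) -> PathBuf {
    match var {
        Some(v) if !v.is_empty() => PathBuf::from(v),
        _ => PathBuf::from(home).join(default_rel),
    }
}

/// Eager warmup for binary entry points; safe to call more than once.
pub fn init() -> &'static Xdg {
    XDG.get_or_init(|| {
        // Assumption: fall back to "/" when HOME is unset.
        let home = std::env::var("HOME").unwrap_or_else(|_| String::from("/"));
        Xdg {
            config_home: resolve(std::env::var("XDG_CONFIG_HOME").ok(), &home, ".config"),
            state_home: resolve(std::env::var("XDG_STATE_HOME").ok(), &home, ".local/state"),
        }
    })
}

/// Accessor used by path helpers: initializes on first miss, so early callers
/// such as tests never observe an uninitialized state.
pub fn xdg() -> &'static Xdg {
    init()
}
```

Because `get_or_init` runs the closure at most once, calling `xdg()` before or after `init()` yields the same reference.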
Dead modules removed (1500 LOC):
- nest::pool (Nest, never used)
- nest::mock (only used by pool tests)
- nest::health (tiered eviction policy, no callers)
- nest::budget (token/tool-call caps, no callers)
- nest::Listener + nest::ForagerBee traits (Listener impl in serve.rs
  was empty stubs; ForagerBee had zero impls)

Wired:
- nest::metrics::spawn_sampler now drives a hum_cell_rss_bytes /
  hum_cell_cpu_ms gauge labelled by pid from inside lifecycle::supervise.
  Per-cell observability stops being a half-built feature.
- humd.metricsAddr config knob (hum.json) replaces the hardcoded
  127.0.0.1:9909. Schema entry added.

Naming standardized: hum_cells_active -> hum_cell_count.

Dead code removed in hives/common/src/serve.rs (HashMap import,
WireListener Listener impl), humd/src/lib.rs (HumdSink.cli_path),
hum/src/main.rs (home() helper).

Tests added:
- serve::tests::drop_cancels_token — RAII fires on drop
- serve::tests::lru_pop_drops_bundle_and_cancels — LRU eviction kills
- serve::tests::map_clear_cancels_all — shutdown kills everything
- humd::prometheus_endpoint — /metrics actually serves
- thrumd::accept_rate_limit — governor paces accepts at quota
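
The three serve tests above all rest on one property: dropping a bundle cancels its token. A std-only sketch of that RAII shape follows. `CancelFlag` is a hypothetical stand-in for `CancellationToken`, and `CellBundle` here carries only the one field needed to show the idea.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Std-only stand-in for a cancellation token (assumption, not the real type).
#[derive(Clone, Default)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn cancel(&self) {
        self.0.store(true, Ordering::SeqCst);
    }
    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}

/// Hypothetical bundle: owning it means owning the cell's lifetime.
pub struct CellBundle {
    pub cancel: CancelFlag,
}

impl Drop for CellBundle {
    /// Any removal path (remove, eviction, clear) ends here.
    fn drop(&mut self) {
        self.cancel.cancel();
    }
}

/// Inserts `n` bundles and returns the map plus one observer flag per key.
pub fn populate(n: usize) -> (HashMap<usize, CellBundle>, Vec<CancelFlag>) {
    let mut map = HashMap::new();
    let mut observers = Vec::new();
    for key in 0..n {
        let flag = CancelFlag::default();
        observers.push(flag.clone());
        map.insert(key, CellBundle { cancel: flag });
    }
    (map, observers)
}
```

Putting the cancel in `Drop` means no eviction site has to remember to kill anything; removing the entry is the kill.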
…urged, flaky test budgeted

- nest::ResourceLimits now actually applied: claude-cli calls
  apply_pre_exec on the std::process::Command (via cmd.as_std_mut())
  before group_spawn. SpawnSpec.resource_limits stops being decorative.
- CatalogueSlot.sid: stored but never read. Field + setter param dropped.
- humfs::tools::read::is_code_file: a fn unused in production, whose tests
  exercised nothing the codebase relies on. Fn and tests both purged.
- partition_then_heal_converges_wane: budget raised 1s -> 10s so the
  test is no longer load-flaky under parallel runs.

cargo check + cargo check --tests: zero warnings.
Rendezvous file (P1 #40 fix):
- hum_paths::RuntimeInfo + HumnestRuntimeInfo (atomic write, read, remove).
- thrumd::serve_with_hook fires on_bound after UnixListener::bind. humd
  uses it to publish runtime.json with socket + pid + version + bound_at_ms.
- hum_paths::thrum_sock_resolved prefers env > runtime.json > default.
  Daemons keep thrum_sock; bees/CLI use _resolved. Socket-path drift is
  gone — clients always reach whatever humd actually bound.
- hum doctor connect-tests with a real hello tone, 1s timeout. exists()
  check replaced. doctor surfaces humnest runtime.json too.
- hum bee --list flags "⚠ crash-looping (exit N)" via new svc_last_exit
  reading systemctl ExecMainStatus / launchctl 'last exit code'.
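
A sketch of what an atomic write of the rendezvous file could look like, using the usual write-temp-then-rename pattern. The four field names come from the commit text; the JSON layout, the `to_json` and `write_atomic` names, and the `.tmp` sibling file are assumptions.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Fields named in the commit; the JSON layout below is an assumption.
pub struct RuntimeInfo {
    pub socket: String,
    pub pid: u32,
    pub version: String,
    pub bound_at_ms: u64,
}

impl RuntimeInfo {
    /// Hand-rolled JSON to stay std-only; assumes the strings contain
    /// no quotes or backslashes.
    pub fn to_json(&self) -> String {
        format!(
            "{{\"socket\":\"{}\",\"pid\":{},\"version\":\"{}\",\"bound_at_ms\":{}}}",
            self.socket, self.pid, self.version, self.bound_at_ms
        )
    }
}

/// Write to a sibling temp file, flush it to disk, then rename over the target.
/// On POSIX filesystems the rename is atomic, so a reader sees either the old
/// file or the new one, never a partial write.
pub fn write_atomic(path: &Path, contents: &str) -> io::Result<()> {
    let tmp = path.with_extension("tmp");
    let mut file = fs::File::create(&tmp)?;
    file.write_all(contents.as_bytes())?;
    file.sync_all()?;
    fs::rename(&tmp, path)
}
```

Publishing only after the listener has bound (the `on_bound` hook) means a client that finds the file can expect the socket in it to exist.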

humnest crate — bee supervisor sibling daemon:
- Reads humnest.bees[] from hum.json.
- Each bee spawned via nest::lifecycle::supervise (group-kill, RAII).
- Restart policy per bee: always | on-failure (max_retries, backoff) | never.
- Crash-loop state surfaced through humnest-list RPC.
- Control socket at humnest.sock (NDJSON): humnest-spawn|humnest-kill|humnest-list.
- Sibling to humd: humd crash != humnest crash, bees stay alive.
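
The restart policy can be pictured as a small decision function. This is a sketch under stated assumptions: the variant names follow the three policies listed above, but `next_restart`, the doubling delay, and its cap are hypothetical, since the commit does not say how backoff grows.

```rust
use std::time::Duration;

#[derive(Debug, Clone, PartialEq)]
pub enum RestartPolicy {
    Always,
    OnFailure { max_retries: u32, backoff: Duration },
    Never,
}

/// Returns Some(delay) when the bee should be restarted, None when it should stay down.
/// `retries` is the number of restarts already attempted.
pub fn next_restart(policy: &RestartPolicy, exit_ok: bool, retries: u32) -> Option<Duration> {
    match policy {
        RestartPolicy::Always => Some(Duration::ZERO),
        RestartPolicy::Never => None,
        RestartPolicy::OnFailure { max_retries, backoff } => {
            if exit_ok || retries >= *max_retries {
                None
            } else {
                // Assumption: the delay doubles per retry, capped at 2^6 times the base.
                Some(*backoff * 2u32.pow(retries.min(6)))
            }
        }
    }
}
```

A bee that exhausts `max_retries` returns `None` here, which is the state a crash-loop report would surface.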

humctl crate — service-manager 0.7 wrapper:
- humctl {install|start|stop|restart|status|uninstall} {humd|humnest}.
- Pure Rust, ServiceLevel::User, cross-platform via service-manager crate.
- scripts/svc.sh DELETED. 300 lines of bash, gone.
- ./install + every hives/*/install rewritten to call humctl, drop svc.sh
  sourcing. Hive install now appends to hum.json humnest.bees + restarts
  humnest instead of generating per-bee systemd/launchd units.

config gained humnest.bees[] schema. hum CLI: new 'hum nest' subcommand
lists humnest bees with state + restart count. 'hum bee enter|exit|reenter'
routes through humnest first for kinds in hum.json, falls back to legacy
svc paths for unknown / 'all' targets.

247 tests pass.
Same priority order as hum_paths::thrum_sock_resolved on the Rust side:
HUM_THRUM_SOCK > runtime.json > computed default. Closes the last gap
where non-Rust clients would miss humd's published socket path after
restart.
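
The priority order is easiest to see as a pure function over the three candidate sources. The sketch below assumes the caller has already read `HUM_THRUM_SOCK` and the socket field of runtime.json. `resolve_thrum_sock`, `Source`, and the treatment of empty strings as absent are hypothetical.

```rust
use std::path::PathBuf;

/// Where the resolved path came from (handy for diagnostics output).
#[derive(Debug, PartialEq)]
pub enum Source {
    Env,
    RuntimeJson,
    Default,
}

/// Assumption: an empty value counts as "not set".
fn non_empty(v: Option<&str>) -> Option<&str> {
    v.filter(|s| !s.is_empty())
}

/// env > runtime.json > computed default.
pub fn resolve_thrum_sock(
    env: Option<&str>,
    runtime_json: Option<&str>,
    default: &str,
) -> (PathBuf, Source) {
    if let Some(p) = non_empty(env) {
        return (PathBuf::from(p), Source::Env);
    }
    if let Some(p) = non_empty(runtime_json) {
        return (PathBuf::from(p), Source::RuntimeJson);
    }
    (PathBuf::from(default), Source::Default)
}
```

Keeping the function free of I/O lets clients in any language mirror it and makes the ordering trivial to test.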
humctl shrinks to humd-only operator (no install/uninstall — bootstrap
owns service registration). New verbs: start, stop, restart, status,
logs, health. Status shells systemctl/launchctl native; health does a
real connect + hello/breath probe through hum_paths::thrum_sock_resolved.

hum_paths::macos_log is replaced by daemon_logs(name) returning a typed
DaemonLogs { Journald{unit} | Files{stdout,stderr} } so callers dispatch
on the enum instead of #[cfg]-ing per platform. humctl logs and hum's
print_recent_logs now share the same path.
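
A sketch of the typed-enum dispatch, for illustration only. The enum shape follows the commit text, but the unit and file naming, the extra `log_dir` parameter (the real `daemon_logs` takes only a name), and `tail_argv` are assumptions.

```rust
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
pub enum DaemonLogs {
    Journald { unit: String },
    Files { stdout: PathBuf, stderr: PathBuf },
}

/// The one place that knows about the platform.
/// Assumed naming: "<name>.service" on Linux, "<name>.out.log" / "<name>.err.log" elsewhere.
pub fn daemon_logs(name: &str, log_dir: &Path) -> DaemonLogs {
    if cfg!(target_os = "linux") {
        DaemonLogs::Journald { unit: format!("{name}.service") }
    } else {
        DaemonLogs::Files {
            stdout: log_dir.join(format!("{name}.out.log")),
            stderr: log_dir.join(format!("{name}.err.log")),
        }
    }
}

/// Callers dispatch on the enum; no #[cfg] needed here.
/// Returns an argv that prints the last `lines` log lines.
pub fn tail_argv(logs: &DaemonLogs, lines: u32) -> Vec<String> {
    let n = lines.to_string();
    match logs {
        DaemonLogs::Journald { unit } => vec![
            "journalctl".to_string(),
            "--user".to_string(),
            "-u".to_string(),
            unit.clone(),
            "-n".to_string(),
            n,
            "--no-pager".to_string(),
        ],
        DaemonLogs::Files { stdout, stderr } => vec![
            "tail".to_string(),
            "-n".to_string(),
            n,
            stdout.display().to_string(),
            stderr.display().to_string(),
        ],
    }
}
```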

install bash generates humd.service + humnest.service inline (Linux) or
humd.plist + humnest.plist (macOS), coupling expressed in the unit
files themselves (Wants= / PartOf= / RunAtLoad). svc.sh stays deleted.

The SIGHUP reload handler I had added to humnest is reverted: it was a
coarse 'reconcile against disk' plaster. Surgical RPC (humnest-spawn /
humnest-kill via the existing control socket) is the right primitive,
and hum_route_verb already uses it.
humnest is gone. orchd (sibling Rust project; user-scope systemd +
launchd platform contributed upstream this round) replaces it.

Architecture now:
  launchd/systemd
    └── humd (one user service; humctl operates; ./install registers)
  orchd
    └── per-bee user units (one each, generated from per-hive Orchfile)

Hive surface: each hive ships an Orchfile at its root. `hum hive install
<target>` resolves the target, builds (cargo install --path), copies the
Orchfile into ~/.config/hum/orch.d/, re-assembles ~/.config/hum/Orchfile,
and runs `orchd up <kind>`. No more per-hive install bash scripts.

What dies:
- humnest crate (entire thing — 4 modules, supervisor + control + log + lib)
- hum_paths::humnest_sock, humnest_runtime, HumnestRuntimeInfo
- config::HumnestSection, BeeConfig (orch's Orchfile owns this declaration)
- hum.schema.json humnest section
- 4 hives/*/install bash scripts (claude-cli, claude-repl, humfs, paid-oracle)
- ./install humnest unit generation (no second user unit)

What lives:
- humd, humctl, hum CLI: unchanged in role
- humctl: humd operator only (start/stop/status/logs/health)
- orch + orchd: pulled as git deps by ./install via cargo install

What's new:
- hives/{claude-cli,claude-repl,humfs,paid-oracle}/Orchfile (declarative)
- hum CLI: hive_install does build + register-via-Orchfile + orchd up
- hum CLI: bee enter/exit/reenter routes through orchd up/down/restart
- hum CLI: 'hum nest' delegates to 'orchd status'
- ./install pulls orch + orchd from github via cargo install

openai-server (TS) still has its bash install — TS build pipeline
(pnpm install + tsup) not yet automated through hum hive install. Follow-up.

247 tests pass.
Detects build kind by marker file:
- Cargo.toml         → cargo install --path
- package.json       → pnpm/npm install + build; writes ~/.local/bin/<kind>
                       as a node wrapper exec'ing the produced dist/index.js
                       (honors pre-built dist if no pnpm/npm in PATH)
- go.mod             → go build -o ~/.local/bin/<kind>
- 'build' script     → execute it (escape hatch for exotic hives)
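
Marker-file detection can be sketched as a single ordered lookup. The marker names come from the list above. `BuildKind`, `detect_build_kind`, and the precedence when several markers coexist (first match in the listed order) are assumptions.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
pub enum BuildKind {
    Cargo,
    Node,
    Go,
    Script,
}

/// Returns the build kind for a hive directory, or None if no marker is present.
/// Assumption: when several markers exist, the first in this order wins.
pub fn detect_build_kind(hive_dir: &Path) -> Option<BuildKind> {
    let markers = [
        ("Cargo.toml", BuildKind::Cargo),
        ("package.json", BuildKind::Node),
        ("go.mod", BuildKind::Go),
        ("build", BuildKind::Script),
    ];
    for (marker, kind) in markers {
        if hive_dir.join(marker).is_file() {
            return Some(kind);
        }
    }
    None
}
```

A caller would then match on the result to pick the cargo, node, go, or script build path.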

Orchfiles added for the remaining hives (every hive now ships one):
- bp7, grpc, gsm-modem, ollama-server (Rust foragers; bp7-forager,
  grpc-forager, ollama-server, gsm-modem binaries)
- openai-server, anthropic-server, vercel-ai (TS HTTP foragers)
- twilio-sms (Go forager)

12 hives total, all surface-uniform: ~/.local/bin/<kind> + Orchfile.

hives/openai-server/install bash deleted — the build pipeline (pnpm
install + tsup build + node wrapper) now lives in hum CLI's
build_node(). One code path for all TS hives.

247 tests pass.
Constitution-level rename. Where hum owns the names, they sing.

  SpawnSpec      → Egg                  (the thing-to-be a worker raises)
  WorkerBee::spawn → ::raise            (workers raise brood)
  Cell.pid       → Cell.mark            (beekeeper's mark)
  Cell.stdin     → Cell.feed            (nurses feed larvae through open cells)
  Cell.events    → Cell.mmm             (the sounds from inside the cell)
  Cell.exited    → Cell.emerged         (adult bee chews out the cap and emerges)
  Cell.cancel    → Cell.silence + Cell.still()  (token field, verb method)
  ResourceLimits → Bounds
  resource_limits→ bounds
  CellMetrics    → Vitals
  Attachment     → Pollen               (what a forager carries home)
  encode_prompt_with_attachments → encode_prompt_with_pollen
  nest::lifecycle::supervise → ::tend   (nurse bees tend brood cells)

Foreign types stay foreign (CancellationToken, mpsc::Sender,
AsyncGroupChild, tokio::process::Command, oneshot::Receiver). We don't
own those words. Our surface either reads like apiary biology or stays
out of the way.

247 tests pass.
ids/HumId newtype replaces stringly-typed identifiers everywhere hum
mints them. Foreign formats are projections via deterministic transforms.

ids crate
  - HumId(String) newtype with Serialize/Deserialize/Display/FromStr/AsRef<str>
  - HumId::mint() — ts-prefixed random; sessions, requests, calls
  - HumId::from_hash([u8; 32]) — pure hash encoding
  - HumId::from_foreign(&str) — deterministic projection; same input → same id
  - to_uuid_v5(NS_CLAUDE_SESSION) — outgoing UUID for claude --session-id
  - NS_CLAUDE_SESSION preserves the historical HUM_SESSION_NS bytes so
    existing claude transcripts stay reachable

Type signature changes (internal)
  - nest::Egg.sid: HumId  (was String)
  - nest::Egg.session_id REMOVED  (derived from sid.to_uuid_v5 inside claude-cli)
  - nest::Egg.fresh: bool ADDED  (resume vs --session-id flag for the worker)

Mint sites canonicalized
  - thrumd connection id (cid) — HumId::mint() (was uuid::Uuid::new_v4)
  - thrum_core::rid() returns HumId-format (was tsBase36-counterBase36)
  - hives/common/mcp_bridge call_id — thrum_core::rid() (no more 'call-' prefix)
  - all hive hello rids: bp7, grpc, gsm-modem, paid-oracle, ollama-server, ensemble
  - ollama-server per-request sid — HumId::mint()
  - all sim test cid/rid fixtures

Boundary
  - hives/common/serve.rs canonicalizes incoming wire sid:
    HumId::parse(&s).unwrap_or_else(|_| HumId::from_foreign(&s))
  - dead HUM_SESSION_NS + sid_to_session helper removed
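
The boundary rule (parse, else project) can be sketched with a toy `HumId`. Everything about the id format here is hypothetical: the 16-hex-character rule stands in for the real format, and `DefaultHasher` stands in for whatever stable hash the ids crate uses. A real projection would need a hash whose output is fixed across Rust versions.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct HumId(String);

impl HumId {
    /// Hypothetical format rule: exactly 16 lowercase hex characters.
    pub fn parse(s: &str) -> Result<HumId, String> {
        let ok = s.len() == 16 && s.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f'));
        if ok {
            Ok(HumId(s.to_string()))
        } else {
            Err(format!("not a HumId: {s:?}"))
        }
    }

    /// Deterministic projection: the same foreign string always maps to the same id.
    /// DefaultHasher is a stand-in only; its algorithm is not guaranteed stable.
    pub fn from_foreign(s: &str) -> HumId {
        let mut hasher = DefaultHasher::new();
        s.hash(&mut hasher);
        HumId(format!("{:016x}", hasher.finish()))
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Native ids pass through untouched; anything else is projected.
pub fn canonicalize(wire_sid: &str) -> HumId {
    HumId::parse(wire_sid).unwrap_or_else(|_| HumId::from_foreign(wire_sid))
}
```

Because the projection is deterministic, a foreign client that reconnects with the same sid lands on the same internal id without any lookup table.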

hum-paths bonus
  - config::denied() uses hum_paths::config_dir() instead of '~/.config/hum'
    literal (now correct under XDG_CONFIG_HOME override)
  - hum CLI module doc adds 'hum nest' + 'hum update'; drops stale
    'Inspection-only for 0.3'; fixes bees.json comment

Untouched (by design)
  - Hid (humd/bee identity, sha256(pubkey) hex with role prefix) — different beast
  - Wire types stay String; humd canonicalizes at entry handlers
  - kad query rids (derived from query_id for cross-hop correlation)
  - Cryptographic nonce in paid-oracle
Every humd maintains a signed append-only NDJSON ring of every chi event
it observes. The log is the only authoritative store of activity;
bees.json, sid -> bee routes, session state are projections via
thehum.replay().

New crate: thehum
- Event: chi + sid + rid + body + author hid + seq + ts_ms + prev_hash + sig
- HumId::mint() rids; humd.key signs; sha256 hash chain
- TheHum::open / append / tail / range / replay / snapshot / enforce_retention / anchor
- Retention modes: archive, rolling, light (configurable per humd)
- Snapshots: Merkle root over BTreeMap of state leaves, emitted as
  chi:snapshot into own log
- AnchorBackend trait + EvmAnchor scaffold for on-chain commitments
  (production users wire their own signer)
- Sorted-key canonical JSON for hashing+signing; sig stripped from
  canonical bytes so chain integrity is sig-agnostic
- 24 unit tests + 7 end-to-end tests pass
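
To illustrate why stripping `sig` from the canonical bytes makes chain integrity sig-agnostic, here is a simplified sketch. It assumes a flat string-to-string event, uses a `BTreeMap` for sorted keys, and substitutes `DefaultHasher` for sha256. `Event`, `canonical_bytes`, `link_hash`, and `verify_chain` are hypothetical names, not the thehum API.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::{Hash, Hasher};

/// Flat event for illustration; BTreeMap iterates in sorted key order.
pub type Event = BTreeMap<String, String>;

/// Minimal escaping (backslash and double quote only).
fn escape(s: &str) -> String {
    s.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Sorted-key JSON with the "sig" field left out.
pub fn canonical_bytes(event: &Event) -> Vec<u8> {
    let fields: Vec<String> = event
        .iter()
        .filter(|(k, _)| k.as_str() != "sig")
        .map(|(k, v)| format!("\"{}\":\"{}\"", escape(k), escape(v)))
        .collect();
    format!("{{{}}}", fields.join(",")).into_bytes()
}

/// Stand-in for sha256 over the canonical bytes.
pub fn link_hash(event: &Event) -> String {
    let mut hasher = DefaultHasher::new();
    canonical_bytes(event).hash(&mut hasher);
    format!("{:016x}", hasher.finish())
}

/// Every event's prev_hash must equal the hash of its predecessor.
pub fn verify_chain(log: &[Event]) -> bool {
    log.windows(2).all(|pair| {
        let expected = link_hash(&pair[0]);
        pair[1].get("prev_hash") == Some(&expected)
    })
}
```

Re-signing an event changes only `sig`, so its link hash and every later `prev_hash` stay valid. Changing any other field breaks the next link.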

Integration
- humd opens TheHum from humd_key + hum_paths::thehum_dir() at boot
- Replays the log to rebuild bee manifests (pure handler; reads
  event.ts_ms not now())
- Appends every incoming tone before routing (chi log -> handler chain)
- Spawns 30s snapshot+retention background task
- New chi:Backfill handler responds with thehum.range(author, from)
- Backfill enum variant added to thrum-core/chi.rs

Determinism scrub
- humd: nestler_id fallback uses client_id (was SystemTime ms)
- humd: cwd fallback is "/" (was env::var HOME) — replay must not read env
- humd: sorted iteration in worker selection + forager tool catalogue
  during chi:Prompt handler (was HashMap order)

Deletions
- hums crate (vestigial; was Hums::load() with discarded result)
- hum_paths::hums_json (no callers)

New surfaces
- hum thehum {status|tail|range|verify|replay}
- humctl thehum (health check, no daemon needed)
- thehum::layout module — single source of truth for seq.bin / snapshots/
  / root.txt names (no literal filenames outside thehum crate)

Tests: 275 passed across the workspace (was 247).
Promote every path construction + every literal basename across the
workspace into hum-paths. Single source of truth for everything hum
reads or writes; nothing constructs a hum filename outside this crate.

New constants (canonical basenames):
  THRUM_SOCK_BASENAME, HTTP_SOCK_BASENAME, PENNY_BASENAME,
  HUMD_KEY_BASENAME, BEES_SNAPSHOT_BASENAME, RUNTIME_INFO_BASENAME,
  HUM_JSON_BASENAME, PEERS_JSON_BASENAME, ORCHFILE_BASENAME,
  HIVES_SUBDIR, RECIPES_SUBDIR, HIVE_INSTALL_SCRIPT

New helpers (composed paths):
  home(), expand_tilde(p) — single tilde-expander for user config
  local_dir(), local_bin_dir() — $HOME/.local + .local/bin
  hum_bin(name) — installed-binary location for a hum binary
  fnm_node_bin() — fnm-managed node fallback
  claude_data_dir(), claude_session_dir(cwd_hash) — claude CLI layout
  orch_d_dir(), orchfile() — orchd integration files
  foreign_hive_cache(org, repo, branch) — github-source clone cache
  svc_script() — scripts/svc.sh shipped with the source clone
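
A minimal sketch of how a few of these helpers might compose, assuming `home` is passed in explicitly to keep the functions pure. The real helpers take no such argument, and placing `hum_bin` under `.local/bin` is an assumption based on the `~/.local/bin/<kind>` convention used elsewhere in this PR.

```rust
use std::path::{Path, PathBuf};

/// Expands a leading "~" or "~/" against `home`; everything else passes through.
/// "~user" forms are deliberately not handled in this sketch.
pub fn expand_tilde(p: &str, home: &Path) -> PathBuf {
    if p == "~" {
        home.to_path_buf()
    } else if let Some(rest) = p.strip_prefix("~/") {
        home.join(rest)
    } else {
        PathBuf::from(p)
    }
}

/// $HOME/.local
pub fn local_dir(home: &Path) -> PathBuf {
    home.join(".local")
}

/// $HOME/.local/bin
pub fn local_bin_dir(home: &Path) -> PathBuf {
    local_dir(home).join("bin")
}

/// Assumed location of an installed hum binary.
pub fn hum_bin(home: &Path, name: &str) -> PathBuf {
    local_bin_dir(home).join(name)
}
```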

Migration:
- config::expand_tilde delegates to hum_paths::expand_tilde
- hives/{claude-cli,claude-repl} propagate HOME via hum_paths::home()
- claude-cli/graft uses claude_session_dir(cwd_hash)
- hum CLI's home_local() / hum_orchfile() helpers deleted; callers use
  hum_paths::{local_dir, local_bin_dir, hum_bin, orchfile, orch_d_dir,
  foreign_hive_cache, svc_script, HIVES_SUBDIR, RECIPES_SUBDIR,
  HIVE_INSTALL_SCRIPT, ORCHFILE_BASENAME}
- sim, penny, humd/peers tests reach for hum_paths::*_BASENAME and
  hum_paths::peers_json() instead of literal filenames
- humd/peers test uses hum_paths::config_dir() after XDG_CONFIG_HOME
  override (was reconstructing 'hum' subdir manually)

The only HOME / XDG_* reads left in the workspace:
- hum-paths itself (the source of truth)
- doctor diagnostics (reports raw env values to the user)
- test isolation (set_var XDG_* on tmp dirs)
All routine code reaches for hum_paths instead.

275 tests pass.
b25ddc3 killed scripts/svc.sh but the CLI kept calling its functions
through dead bash shell-outs (svc_helper, svc_active, svc_last_exit,
svc_start/stop/restart, svc_list, svc_uninstall, svc_status). Removing
all of it.

Deletions:
- svc_helper() — scripts/svc.sh discovery
- svc_active(), svc_last_exit() — bash exit-code probes
- bee_list(svc) — bash svc_list scraper
- resolve_units() — unit name resolver only used by the bash path
- hum_paths::svc_script() — no callers after this commit
- All bash shell-outs to svc_start/stop/restart/uninstall/status

Rewrites:
- bee_list_full now takes a Vec<String> from orch_catalog() (installed
  hive kinds), not svc-discovered unit names. State printed as
  'orchd-managed' / 'unmanaged' / 'installed, not handshaked' based on
  presence in humd's bees.json + orch catalog.
- bee() routes every verb through orch_route_verb. No bash fallback.
- hive_list() marks running kinds from orch_catalog() (no svc_list).
- uninstall() calls 'humctl stop' instead of bash svc_uninstall.
- status() drops the trailing 'svc_status hum' bash call (humctl status
  is the canonical surface now).

275 tests pass.
thehum::layout module merged into hum-paths (single source of truth):
  pub const THEHUM_SEQ_BASENAME / THEHUM_SNAPSHOTS_SUBDIR /
           THEHUM_ROOT_BASENAME / THEHUM_NDJSON_EXT
  pub fn thehum_seq_file / thehum_snapshots_dir / thehum_root_file
hum + humctl + thehum-internal callers updated.

Compiler-flagged dead code (zero remaining):
- removed gsm-modem::now_ms (never used after the hello-rid migration
  to ids::HumId::mint)
- removed humd's stale "see TODO in nest::pool::Nest" comment — pool
  module was deleted in the orchd adoption

Stale TS-era references rewritten:
- claude-repl module doc dropped "Real behavior in TS lives in
  harness.ts" (Rust IS the implementation now); unused FSM variants
  Hunting/Wilting/Hushed deleted
- ollama-server `images: Option<Vec<String>>` field deleted (was an
  #[allow(dead_code)] placeholder for a feature that never landed)
- hives/common/serve module doc points at current Cell API (cell.mmm,
  cell.still, raise instead of spawn)
- thrum-core envelope/prim docs: drop "TS daemon" / "TS wire shape"
  framing
- thrumd::thrum_broadcast: drop "matches the TS daemon's routing"
- drone::Health: drop "the TS Assessment strings"
- claude-cli/graft: drop "TS writer" / "like the TS does"
- ids tests: drop "lib/id.ts encodeBase32" reference
- penny load test: drop "TS shape" wording

Test pass: 275.
Drove from rustc's unreachable_pub lint with RUSTFLAGS=-W unreachable_pub
across the workspace. 68 items downgraded:

  ensemble/kad.rs      6   internal kad helpers (ParsedFindNode*, etc)
  thrumd/conn.rs       1
  thrumd/registry.rs   7   internal sigil-broadcast types
  config/lib.rs        7   defaults::* helpers
  humd/peers.rs        1   load() called only from boot
  hives/common/mcp_bridge.rs  8  test-side reqwest_lite helpers
  hives/humfs (ast/* + tools/* + dispatch.rs)  38  intra-crate types

hum-paths was excluded by design — every helper there is meant to be
called from anywhere in the workspace, current consumers or not.

275 tests still pass.
paths.rs holds every "../thrum-clients/{ts,python,go}/..." literal.
Both `cargo run -p codegen` and `thrum-core/build.rs` route through
codegen::paths instead of carrying their own copies, so a rename can't
leave one site stale.

While here: --check now covers all three targets (ts, python, go) and
no-arg `cargo run -p codegen` regenerates all of them. The previous
CLI silently dropped python and go even though build.rs emitted them.
The library had every piece. humd was only plugged into the outbound
half, and only over TCP. Two real humds could not meet.

humd/src/peer_transport/ now owns the daemon-side plumbing as two
sibling modules:

  iroh.rs   bind(humd_key) returns an IrohTransport whose NodeId is
            pinned to the persistent HumdKey, so the signed hello
            verifies against the iroh-routed identity. dial_all walks
            peers.json for iroh: hints; spawn_listener detaches an
            accept loop. Both paths land at Ensemble::install (signed).

  tcp.rs    Sibling shape. spawn_listener binds humd.tcpListen and
            accepts; dial_all walks tcp: hints. Plaintext NDJSON; the
            signed hello is the only authentication on the wire.

Surrounding changes that came up while finishing this:

  ensemble::IrohTransport::bind_direct_with_key(&HumdKey)
    New constructor. Pins iroh's SecretKey to our HumdKey so NodeId
    == pubkey and Hid == sha256(pubkey) collapse to one identity.

  ensemble drainer re-keys peers on Verified hello
    TCP accept() returns a Hid::random_humd() placeholder because
    plain TCP can't authenticate the peer pre-handshake. The drainer
    was rejecting every inbound signed hello on `claimed_id == id`.
    Verified hellos now move the registry entry from the placeholder
    to the cryptographically-verified id. iroh path unchanged (its
    NodeId matches claimed_id by construction). Unsigned hellos still
    require id match, since the sig is what makes re-key safe.

  humd::identity::read_key
    Read-only loader so `hum ensemble` can show the daemon's id
    without minting one as a side effect.

  config: humd.tcpListen
    Optional "host:port" in hum.json. Omitted = dial-only over TCP.
    iroh is independent of this setting (always tried).

  RuntimeInfo.ensemble_addrs
    Populated with iroh: + iroh-ip: (and tcp: when listening) hints.
    A peer copies these into its peers.json to dial this humd back.

  hum CLI: ensemble subcommand
    `hum ensemble`               show me + reach + configured peers
    `hum ensemble peer add ...`  append entry to peers.json (atomic)
    `hum ensemble peer rm  ...`  drop by humd_id or alias
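
The re-key rule described above for the ensemble drainer can be sketched as a decision over three inputs. The registry is reduced to a plain `HashMap` keyed by id string. `HelloOutcome` and `on_hello` are hypothetical names, and the sketch does not handle the case where the claimed id is already registered on another connection.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
pub enum HelloOutcome {
    /// Claimed id already matches the registry entry (the iroh case).
    Accepted,
    /// Entry moved from the placeholder id to the verified id (the TCP accept case).
    Rekeyed,
    /// Mismatch without a valid signature, or no entry to move.
    Rejected,
}

/// `current_id` is the id the connection is registered under,
/// `claimed_id` is the id in the hello, `verified` is the signature check result.
pub fn on_hello<C>(
    peers: &mut HashMap<String, C>,
    current_id: &str,
    claimed_id: &str,
    verified: bool,
) -> HelloOutcome {
    if claimed_id == current_id {
        return HelloOutcome::Accepted;
    }
    if !verified {
        // Unsigned hellos must match: the signature is what makes re-keying safe.
        return HelloOutcome::Rejected;
    }
    match peers.remove(current_id) {
        Some(conn) => {
            peers.insert(claimed_id.to_string(), conn);
            HelloOutcome::Rekeyed
        }
        None => HelloOutcome::Rejected,
    }
}
```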

Tests:

  peer_transport::iroh::tests   real iroh QUIC round-trip, both
                                sides end with each other in
                                Ensemble::peers().
  peer_transport::tcp::tests    same shape, exercises the re-key
                                path on the accept side.
  46 ensemble tests, 277 workspace tests still green.

What's still cold but unblocked: live peer state in `hum ensemble`
needs an admin-tone RPC into the running daemon. Today the CLI reads
on-disk artifacts only. Kad lookup is fed by install but no caller
yet queries it. TLS transport is library-only. Iroh's relay/WAN bind
path exists but humd boots with bind_direct (loopback/LAN only).